Towards Consistent Stochastic Human Motion Prediction via Motion Diffusion
Stochastic Human Motion Prediction (HMP) aims to predict multiple possible
upcoming pose sequences based on past human motion trajectories. Although
previous approaches have shown impressive performance, they face several
issues, including complex training processes and a tendency to generate
predictions that are often inconsistent with the provided history and are
sometimes entirely unreasonable. To overcome these issues, we
propose DiffMotion, an end-to-end diffusion-based stochastic HMP framework.
DiffMotion's motion predictor is composed of two modules: (1) a
Transformer-based network for initial motion reconstruction from corrupted
motion, and (2) a Graph Convolutional Network (GCN) to refine the generated
motion considering past observations. Our method, facilitated by this novel
Transformer-GCN module design and a proposed variance scheduler, excels in
predicting accurate, realistic, and consistent motions, while maintaining an
appropriate level of diversity. Our results on benchmark datasets show that
DiffMotion significantly outperforms previous methods in terms of both accuracy
and fidelity, while demonstrating superior robustness.
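As a rough illustration of the "corrupted motion" that the Transformer module reconstructs from, the sketch below applies the standard DDPM forward noising step to a toy pose sequence. It is not DiffMotion's implementation: the abstract does not specify the proposed variance scheduler, so a plain linear schedule is assumed as a stand-in, and the names `linear_beta_schedule`, `cumulative_alphas`, and `corrupt_motion` are hypothetical.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (standard DDPM choice, assumed here as a stand-in)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def corrupt_motion(motion, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) for a motion given as frames of joint coordinates."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [[signal * v + noise * rng.gauss(0.0, 1.0) for v in frame] for frame in motion]

rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
clean = [[0.1 * f, 0.2 * f, 0.3 * f] for f in range(4)]  # toy motion: 4 frames, 3 values each
noisy = corrupt_motion(clean, t=500, alpha_bars=alpha_bars, rng=rng)
```

Larger values of `t` leave less of the clean signal in `noisy`; a denoising network would be trained to recover the clean motion from such samples.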
DeepSRGM -- Sequence Classification and Ranking in Indian Classical Music with Deep Learning
A vital aspect of Indian Classical Music (ICM) is Raga, which serves as a
melodic framework for compositions and improvisations alike. Raga Recognition
is an important music information retrieval task in ICM as it can aid numerous
downstream applications ranging from music recommendations to organizing huge
music collections. In this work, we propose a deep learning based approach to
Raga recognition. Our approach employs efficient preprocessing and learns
temporal sequences in music data using Long Short Term Memory based Recurrent
Neural Networks (LSTM-RNN). We train and test the network on smaller sequences
sampled from the original audio while the final inference is performed on the
audio as a whole. Our method achieves an accuracy of 88.1% and 97% during
inference on the CompMusic Carnatic dataset and its 10 Raga subset,
respectively, making it the state-of-the-art for the Raga recognition task. Our
approach also enables sequence ranking which aids us in retrieving melodic
patterns from a given music database that are closely related to the presented
query sequence.
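The idea of classifying short subsequences and then inferring on the whole recording can be sketched as follows. The abstract does not say how per-subsequence predictions are combined, so majority voting is assumed here; `toy_classifier` is a hypothetical stand-in for a trained LSTM, and all function names are illustrative.

```python
from collections import Counter

def split_into_subsequences(sequence, length):
    """Cut a long sequence into consecutive, non-overlapping chunks of `length`."""
    return [sequence[i:i + length] for i in range(0, len(sequence) - length + 1, length)]

def predict_whole(sequence, length, classify_chunk):
    """Label each chunk, then return the most common label for the whole recording."""
    chunks = split_into_subsequences(sequence, length)
    if not chunks:
        raise ValueError("sequence is shorter than one subsequence")
    votes = Counter(classify_chunk(chunk) for chunk in chunks)
    return votes.most_common(1)[0][0]

def toy_classifier(chunk):
    """Hypothetical stand-in for a trained LSTM: labels a chunk by its mean value."""
    return "raga_a" if sum(chunk) / len(chunk) < 5 else "raga_b"

pitches = [1, 2, 3, 2, 1, 2, 9, 8, 9, 1, 2, 3]  # toy pitch sequence
label = predict_whole(pitches, length=3, classify_chunk=toy_classifier)
```

Voting over many short chunks lets a model trained on small sequences produce a single decision for an entire recording, at the cost of ignoring how confident each chunk-level prediction is.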